Overview

Dataset statistics

Number of variables: 27
Number of observations: 49247
Missing cells: 2363
Missing cells (%): 0.2%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 10.1 MiB
Average record size in memory: 216.0 B
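
The percentages and sizes in this block follow from the raw counts. As a rough cross-check (a sketch that only re-derives the reported figures, and assumes "missing cells (%)" is taken over variables × observations):

```python
# Figures copied from the "Dataset statistics" block.
n_variables = 27
n_observations = 49247
missing_cells = 2363
avg_record_bytes = 216.0

# Assumption: the missing-cell percentage is relative to all cells in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Approximate total size from the average record size (1 MiB = 2**20 bytes).
total_mib = avg_record_bytes * n_observations / 2**20

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"approx. size in memory: {total_mib:.1f} MiB")
```

Both derived values round to the figures shown in the report (0.2% and 10.1 MiB).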

Variable types

Numeric: 14
Categorical: 13

Alerts

DIRNAME has a high cardinality: 6843 distinct values [High cardinality]
CONAME has a high cardinality: 495 distinct values [High cardinality]
CUSIP has a high cardinality: 495 distinct values [High cardinality]
ADDRESS has a high cardinality: 492 distinct values [High cardinality]
CITY has a high cardinality: 229 distinct values [High cardinality]
ZIP has a high cardinality: 384 distinct values [High cardinality]
SICDESC has a high cardinality: 178 distinct values [High cardinality]
NAICSDESC has a high cardinality: 203 distinct values [High cardinality]
INDDESC has a high cardinality: 122 distinct values [High cardinality]
TICKER has a high cardinality: 495 distinct values [High cardinality]
CASH_FEES is highly overall correlated with TOTAL_SEC [High correlation]
STOCK_AWARDS is highly overall correlated with TOTAL_SEC [High correlation]
TOTAL_SEC is highly overall correlated with CASH_FEES and 1 other field [High correlation]
SUB_TELE is highly overall correlated with STATE [High correlation]
NAICS is highly overall correlated with SIC [High correlation]
SIC is highly overall correlated with NAICS [High correlation]
STATE is highly overall correlated with SUB_TELE [High correlation]
SPCODE is highly imbalanced (96.5%) [Imbalance]
STATE has 2082 (4.2%) missing values [Missing]
STOCK_AWARDS is highly skewed (γ1 = 221.6263592) [Skewed]
OPTION_AWARDS is highly skewed (γ1 = 48.18993794) [Skewed]
NONEQ_INCENT is highly skewed (γ1 = 198.0277207) [Skewed]
PENSION_CHG is highly skewed (γ1 = 58.1177729) [Skewed]
OTHCOMP is highly skewed (γ1 = 146.2004733) [Skewed]
TOTAL_SEC is highly skewed (γ1 = 221.0081778) [Skewed]
CASH_FEES has 3378 (6.9%) zeros [Zeros]
STOCK_AWARDS has 7479 (15.2%) zeros [Zeros]
OPTION_AWARDS has 42377 (86.0%) zeros [Zeros]
NONEQ_INCENT has 49088 (99.7%) zeros [Zeros]
PENSION_CHG has 47593 (96.6%) zeros [Zeros]
OTHCOMP has 31272 (63.5%) zeros [Zeros]
TOTAL_SEC has 1693 (3.4%) zeros [Zeros]
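
The skewness and zero alerts tend to appear together: a column that is mostly zeros with a few very large payments has a long right tail. The sketch below illustrates this with a plain moment-based skewness on a made-up toy column. The data are hypothetical, and the estimator is not necessarily the exact (bias-adjusted) one the profiling tool uses.

```python
def skewness(values):
    """Moment-based skewness g1 = m3 / m2**1.5, using population moments."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

# Hypothetical toy column: 99 zeros and one large payment, loosely
# mimicking the shape of a compensation field that is almost always zero.
toy = [0.0] * 99 + [100.0]
symmetric = [-2.0, -1.0, 0.0, 1.0, 2.0]

zero_share = 100 * toy.count(0.0) / len(toy)
print(f"zeros in toy column: {zero_share:.1f}%")
print(f"toy skewness: {skewness(toy):.2f}")
print(f"symmetric skewness: {skewness(symmetric):.2f}")
```

A single outlier among many zeros is enough to push the statistic far above the values seen for symmetric data, which is the pattern behind the γ1 values in the hundreds reported above.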

Reproduction

Analysis started: 2023-05-08 04:15:34.250172
Analysis finished: 2023-05-08 04:15:57.080767
Duration: 22.83 seconds
Software version: ydata-profiling v4.0.0
Download configuration: config.json

Variables

GVKEY
Real number (ℝ)

Distinct: 495
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 42838.429
Minimum: 1045
Maximum: 316056
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 384.9 KiB

Quantile statistics

Minimum: 1045
5-th percentile: 1913
Q1: 5742
median: 11228
Q3: 61483
95-th percentile: 171007
Maximum: 316056
Range: 315011
Interquartile range (IQR): 55741

Descriptive statistics

Standard deviation: 59686.274
Coefficient of variation (CV): 1.3932881
Kurtosis: 1.2466855
Mean: 42838.429
Median Absolute Deviation (MAD): 8058
Skewness: 1.5516093
Sum: 2.1096641 × 10^9
Variance: 3.5624513 × 10^9
Monotonicity: Increasing
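
Several of the GVKEY figures above are determined by the others. As an illustrative consistency check (a sketch using only the reported, rounded values, so small rounding differences are expected):

```python
import math

# Figures copied from the GVKEY quantile and descriptive statistics.
n = 49247
mean = 42838.429
std = 59686.274
q1, q3 = 5742, 61483
minimum, maximum = 1045, 316056

cv = std / mean                   # coefficient of variation
variance = std ** 2               # variance is the squared standard deviation
total = mean * n                  # sum of all values
iqr = q3 - q1                     # interquartile range
value_range = maximum - minimum   # range

print(f"CV = {cv:.4f}")
print(f"variance = {variance:.4e}")
print(f"sum = {total:.4e}")
print(f"IQR = {iqr}, range = {value_range}")
```

The derived sum and variance are of order 10^9, matching the reported Sum and Variance.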
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
149070              264    0.5%
11856               179    0.4%
5047                170    0.3%
8007                161    0.3%
7647                157    0.3%
8245                156    0.3%
3144                154    0.3%
3243                153    0.3%
4723                153    0.3%
184500              152    0.3%
Other values (485)  47548  96.6%
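
The "Other values" row is the remainder after the ten most frequent values. A small sketch of that bookkeeping, using the counts listed above for GVKEY:

```python
# Counts of the ten most frequent GVKEY values, copied from the table.
top_counts = [264, 179, 170, 161, 157, 156, 154, 153, 153, 152]
n_observations = 49247
n_distinct = 495

other_count = n_observations - sum(top_counts)
other_distinct = n_distinct - len(top_counts)
other_pct = 100 * other_count / n_observations

print(f"Other values ({other_distinct}): {other_count} ({other_pct:.1f}%)")
```

This reproduces the 485 remaining distinct values and the 47548 remaining rows.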

Value  Count  Frequency (%)
1045   127    0.3%
1075   117    0.2%
1078   117    0.2%
1161   101    0.2%
1209   101    0.2%
1230   100    0.2%
1300   112    0.2%
1327   79     0.2%
1380   128    0.3%
1440   127    0.3%

Value   Count  Frequency (%)
316056  38     0.1%
294524  113    0.2%
260774  108    0.2%
189491  87     0.2%
187697  58     0.1%
187450  58     0.1%
186989  99     0.2%
186310  82     0.2%
185532  68     0.1%
184996  74     0.2%

DIRNBR
Real number (ℝ)

Distinct: 32
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.9952078
Minimum: 1
Maximum: 32
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 384.9 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 3
median: 6
Q3: 8
95-th percentile: 12
Maximum: 32
Range: 31
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 3.5664305
Coefficient of variation (CV): 0.59488021
Kurtosis: 0.80657253
Mean: 5.9952078
Median Absolute Deviation (MAD): 3
Skewness: 0.67740138
Sum: 295246
Variance: 12.719426
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=32)
ValueCountFrequency (%)
1 4792
9.7%
2 4790
9.7%
3 4788
9.7%
4 4776
9.7%
5 4733
9.6%
6 4631
9.4%
7 4459
9.1%
8 4139
8.4%
9 3627
7.4%
10 3005
6.1%
Other values (22) 5507
11.2%
ValueCountFrequency (%)
1 4792
9.7%
2 4790
9.7%
3 4788
9.7%
4 4776
9.7%
5 4733
9.6%
6 4631
9.4%
7 4459
9.1%
8 4139
8.4%
9 3627
7.4%
10 3005
6.1%
ValueCountFrequency (%)
32 1
 
< 0.1%
31 2
 
< 0.1%
30 3
 
< 0.1%
29 4
 
< 0.1%
28 4
 
< 0.1%
27 5
< 0.1%
26 5
< 0.1%
25 7
< 0.1%
24 10
< 0.1%
23 11
< 0.1%

DIRNAME
Categorical

Distinct: 6843
Distinct (%): 13.9%
Missing: 0
Missing (%): 0.0%
Memory size: 384.9 KiB
Shirley Ann Jackson: 45
Michael L. Eskew: 41
Alexis M. Herman: 40
Suzanne M. Nora Johnson: 40
Edward M. Liddy, M.B.A.: 38
Other values (6838): 49043

Length

Max length: 78
Median length: 67
Mean length: 19.037545
Min length: 7

Characters and Unicode

Total characters: 937542
Distinct characters: 60
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
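
The mean length and the total character count for DIRNAME are linked through the number of observations (49247, with no missing values). A minimal check, using only figures reported in this document:

```python
# Figures copied from the DIRNAME statistics and the dataset overview.
total_characters = 937542
n_observations = 49247

# Mean length is the total character count spread over all rows.
mean_length = total_characters / n_observations
print(f"mean length: {mean_length:.6f}")
```

The result agrees with the reported mean length of 19.037545 up to rounding.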

Unique

Unique: 657
Unique (%): 1.3%

Sample

1st row: Roger T. Staubach
2nd row: Ann McLaughlin Korologos
3rd row: Judith Rodin, Ph.D.
4th row: David L. Boren
5th row: Ray M. Robinson, Jr.

Common Values

ValueCountFrequency (%)
Shirley Ann Jackson 45
 
0.1%
Michael L. Eskew 41
 
0.1%
Alexis M. Herman 40
 
0.1%
Suzanne M. Nora Johnson 40
 
0.1%
Edward M. Liddy, M.B.A. 38
 
0.1%
Roxanne S. Austin 38
 
0.1%
Patricia F. Russo 38
 
0.1%
Steven S. Reinemund 36
 
0.1%
Richard H. Lenny 35
 
0.1%
Susan C. Schwab 35
 
0.1%
Other values (6833) 48861
99.2%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
j 3732
 
2.3%
a 3641
 
2.3%
m 2948
 
1.8%
jr 2799
 
1.7%
john 2597
 
1.6%
l 2457
 
1.5%
r 2140
 
1.3%
ph.d 2084
 
1.3%
c 2032
 
1.3%
robert 2020
 
1.3%
Other values (6918) 134816
83.6%

Most occurring characters

ValueCountFrequency (%)
(space) 112019
 
11.9%
e 70978
 
7.6%
a 64422
 
6.9%
r 56194
 
6.0%
n 52714
 
5.6%
. 52582
 
5.6%
o 42375
 
4.5%
i 41959
 
4.5%
l 39764
 
4.2%
h 27544
 
2.9%
Other values (50) 376991
40.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 572645
61.1%
Uppercase Letter 183084
 
19.5%
Space Separator 112019
 
11.9%
Other Punctuation 68310
 
7.3%
Dash Punctuation 760
 
0.1%
Open Punctuation 363
 
< 0.1%
Close Punctuation 361
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 70978
12.4%
a 64422
11.2%
r 56194
9.8%
n 52714
9.2%
o 42375
 
7.4%
i 41959
 
7.3%
l 39764
 
6.9%
h 27544
 
4.8%
s 27171
 
4.7%
t 25821
 
4.5%
Other values (16) 123703
21.6%
Uppercase Letter
ValueCountFrequency (%)
J 17559
 
9.6%
M 16043
 
8.8%
A 14055
 
7.7%
C 13318
 
7.3%
D 13038
 
7.1%
S 12272
 
6.7%
B 11109
 
6.1%
R 10887
 
5.9%
P 10613
 
5.8%
L 8442
 
4.6%
Other values (16) 55748
30.4%
Other Punctuation
ValueCountFrequency (%)
. 52582
77.0%
, 15496
 
22.7%
' 222
 
0.3%
/ 10
 
< 0.1%
Space Separator
ValueCountFrequency (%)
(space) 112019
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 760
100.0%
Open Punctuation
ValueCountFrequency (%)
( 363
100.0%
Close Punctuation
ValueCountFrequency (%)
) 361
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 755729
80.6%
Common 181813
 
19.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 70978
 
9.4%
a 64422
 
8.5%
r 56194
 
7.4%
n 52714
 
7.0%
o 42375
 
5.6%
i 41959
 
5.6%
l 39764
 
5.3%
h 27544
 
3.6%
s 27171
 
3.6%
t 25821
 
3.4%
Other values (42) 306787
40.6%
Common
ValueCountFrequency (%)
(space) 112019
61.6%
. 52582
28.9%
, 15496
 
8.5%
- 760
 
0.4%
( 363
 
0.2%
) 361
 
0.2%
' 222
 
0.1%
/ 10
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 937542
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
(space) 112019
 
11.9%
e 70978
 
7.6%
a 64422
 
6.9%
r 56194
 
6.0%
n 52714
 
5.6%
. 52582
 
5.6%
o 42375
 
4.5%
i 41959
 
4.5%
l 39764
 
4.2%
h 27544
 
2.9%
Other values (50) 376991
40.2%

CASH_FEES
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 8754
Distinct (%): 17.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 97.297325
Minimum: 0
Maximum: 4100.385
Zeros: 3378
Zeros (%): 6.9%
Negative: 0
Negative (%): 0.0%
Memory size: 384.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 65.5
median: 98
Q3: 122.283
95-th percentile: 180.8294
Maximum: 4100.385
Range: 4100.385
Interquartile range (IQR): 56.783

Descriptive statistics

Standard deviation: 71.03079
Coefficient of variation (CV): 0.73003847
Kurtosis: 490.96227
Mean: 97.297325
Median Absolute Deviation (MAD): 27
Skewness: 12.643894
Sum: 4791601.3
Variance: 5045.3731
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0 3378
 
6.9%
100 1973
 
4.0%
110 1253
 
2.5%
120 1139
 
2.3%
90 1116
 
2.3%
115 1030
 
2.1%
75 1028
 
2.1%
125 943
 
1.9%
80 758
 
1.5%
85 725
 
1.5%
Other values (8744) 35904
72.9%
ValueCountFrequency (%)
0 3378
6.9%
0.007 2
 
< 0.1%
0.015 3
 
< 0.1%
0.021 2
 
< 0.1%
0.023 1
 
< 0.1%
0.03 1
 
< 0.1%
0.031 1
 
< 0.1%
0.037 2
 
< 0.1%
0.04 2
 
< 0.1%
0.061 1
 
< 0.1%
ValueCountFrequency (%)
4100.385 1
 
< 0.1%
3450 1
 
< 0.1%
3275.385 1
 
< 0.1%
2451.236 1
 
< 0.1%
2395 1
 
< 0.1%
1800 3
< 0.1%
1625 2
< 0.1%
1575 4
< 0.1%
1250 3
< 0.1%
1000 4
< 0.1%

STOCK_AWARDS
Real number (ℝ)

HIGH CORRELATION  SKEWED  ZEROS 

Distinct: 7540
Distinct (%): 15.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 170.82242
Minimum: 0
Maximum: 1927510.7
Zeros: 7479
Zeros (%): 15.2%
Negative: 0
Negative (%): 0.0%
Memory size: 384.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 75.264
median: 129.376
Q3: 170.52
95-th percentile: 265.629
Maximum: 1927510.7
Range: 1927510.7
Interquartile range (IQR): 95.256

Descriptive statistics

Standard deviation: 8688.9804
Coefficient of variation (CV): 50.865574
Kurtosis: 49160.19
Mean: 170.82242
Median Absolute Deviation (MAD): 45.624
Skewness: 221.62636
Sum: 8412491.7
Variance: 75498380
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0 7479
 
15.2%
150 984
 
2.0%
100 612
 
1.2%
125 582
 
1.2%
160 526
 
1.1%
120 499
 
1.0%
140 467
 
0.9%
175 462
 
0.9%
130 453
 
0.9%
110 313
 
0.6%
Other values (7530) 36870
74.9%
ValueCountFrequency (%)
0 7479
15.2%
0.311 1
 
< 0.1%
0.512 1
 
< 0.1%
0.598 1
 
< 0.1%
0.715 1
 
< 0.1%
0.878 1
 
< 0.1%
1.272 2
 
< 0.1%
1.444 1
 
< 0.1%
1.712 1
 
< 0.1%
2.377 1
 
< 0.1%
ValueCountFrequency (%)
1927510.711 1
< 0.1%
43560 1
< 0.1%
27516.225 1
< 0.1%
7397.669 1
< 0.1%
6081.243 1
< 0.1%
4333.238 1
< 0.1%
3437.426 1
< 0.1%
3249.972 1
< 0.1%
3249.971 1
< 0.1%
3249.961 1
< 0.1%

OPTION_AWARDS
Real number (ℝ)

SKEWED  ZEROS 

Distinct: 1579
Distinct (%): 3.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 23.872265
Minimum: 0
Maximum: 23098.558
Zeros: 42377
Zeros (%): 86.0%
Negative: 0
Negative (%): 0.0%
Memory size: 384.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 99.986
Maximum: 23098.558
Range: 23098.558
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 263.91548
Coefficient of variation (CV): 11.055318
Kurtosis: 3053.2969
Mean: 23.872265
Median Absolute Deviation (MAD): 0
Skewness: 48.189938
Sum: 1175637.4
Variance: 69651.38
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0 42377
86.0%
50 73
 
0.1%
65 69
 
0.1%
70 57
 
0.1%
75 45
 
0.1%
110 33
 
0.1%
120 32
 
0.1%
100 32
 
0.1%
23.996 31
 
0.1%
50.001 30
 
0.1%
Other values (1569) 6468
 
13.1%
ValueCountFrequency (%)
0 42377
86.0%
0.27 1
 
< 0.1%
0.337 1
 
< 0.1%
0.394 1
 
< 0.1%
0.436 1
 
< 0.1%
0.44 1
 
< 0.1%
0.571 1
 
< 0.1%
0.621 1
 
< 0.1%
0.711 1
 
< 0.1%
0.763 1
 
< 0.1%
ValueCountFrequency (%)
23098.558 1
< 0.1%
20410.945 1
< 0.1%
17677.5 1
< 0.1%
13330.035 1
< 0.1%
13286.158 1
< 0.1%
13136.634 1
< 0.1%
11716.441 1
< 0.1%
9872.745 1
< 0.1%
9753.005 1
< 0.1%
9005.547 1
< 0.1%

NONEQ_INCENT
Real number (ℝ)

SKEWED  ZEROS 

Distinct: 57
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.36994229
Minimum: 0
Maximum: 6926.502
Zeros: 49088
Zeros (%): 99.7%
Negative: 0
Negative (%): 0.0%
Memory size: 384.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 0
Maximum: 6926.502
Range: 6926.502
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 32.526511
Coefficient of variation (CV): 87.923202
Kurtosis: 41814.683
Mean: 0.36994229
Median Absolute Deviation (MAD): 0
Skewness: 198.02772
Sum: 18218.548
Variance: 1057.9739
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0 49088
99.7%
275 11
 
< 0.1%
24.3 11
 
< 0.1%
27.75 10
 
< 0.1%
19.8 9
 
< 0.1%
28.2 9
 
< 0.1%
17.85 9
 
< 0.1%
28.35 9
 
< 0.1%
30 9
 
< 0.1%
18.75 8
 
< 0.1%
Other values (47) 74
 
0.2%
ValueCountFrequency (%)
0 49088
99.7%
0.15 2
 
< 0.1%
0.189 1
 
< 0.1%
2.781 1
 
< 0.1%
3.026 1
 
< 0.1%
3.743 1
 
< 0.1%
4.442 2
 
< 0.1%
4.594 1
 
< 0.1%
5.076 1
 
< 0.1%
5.575 1
 
< 0.1%
ValueCountFrequency (%)
6926.502 1
 
< 0.1%
1283.392 1
 
< 0.1%
900.154 1
 
< 0.1%
458.341 1
 
< 0.1%
432.6 1
 
< 0.1%
343.962 1
 
< 0.1%
275 11
< 0.1%
200 1
 
< 0.1%
166.146 3
 
< 0.1%
132.376 1
 
< 0.1%

PENSION_CHG
Real number (ℝ)

SKEWED  ZEROS 

Distinct: 1561
Distinct (%): 3.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.0544507
Minimum: -805.309
Maximum: 2420
Zeros: 47593
Zeros (%): 96.6%
Negative: 14
Negative (%): < 0.1%
Memory size: 384.9 KiB

Quantile statistics

Minimum: -805.309
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 0
Maximum: 2420
Range: 3225.309
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 19.820273
Coefficient of variation (CV): 18.796775
Kurtosis: 5597.0123
Mean: 1.0544507
Median Absolute Deviation (MAD): 0
Skewness: 58.117773
Sum: 51928.536
Variance: 392.84323
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0 47593
96.6%
0.003 6
 
< 0.1%
0.002 5
 
< 0.1%
0.005 5
 
< 0.1%
0.019 3
 
< 0.1%
0.883 3
 
< 0.1%
0.035 3
 
< 0.1%
1.436 3
 
< 0.1%
14.797 3
 
< 0.1%
0.762 3
 
< 0.1%
Other values (1551) 1620
 
3.3%
ValueCountFrequency (%)
-805.309 1
< 0.1%
-80.528 1
< 0.1%
-50.396 1
< 0.1%
-31.567 1
< 0.1%
-28.603 1
< 0.1%
-25.922 1
< 0.1%
-20.189 1
< 0.1%
-13.348 1
< 0.1%
-8.872 1
< 0.1%
-7.633 1
< 0.1%
ValueCountFrequency (%)
2420 1
< 0.1%
1254.329 1
< 0.1%
1060.724 1
< 0.1%
1045.873 1
< 0.1%
969.203 1
< 0.1%
913.982 1
< 0.1%
775.329 1
< 0.1%
680.55 1
< 0.1%
627.545 1
< 0.1%
621.999 1
< 0.1%

OTHCOMP
Real number (ℝ)

SKEWED  ZEROS 

Distinct: 8240
Distinct (%): 16.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 13.793034
Minimum: -571.2
Maximum: 47502.388
Zeros: 31272
Zeros (%): 63.5%
Negative: 7
Negative (%): < 0.1%
Memory size: 384.9 KiB

Quantile statistics

Minimum: -571.2
5-th percentile: 0
Q1: 0
median: 0
Q3: 4.6945
95-th percentile: 38.1037
Maximum: 47502.388
Range: 48073.588
Interquartile range (IQR): 4.6945

Descriptive statistics

Standard deviation: 249.2467
Coefficient of variation (CV): 18.070476
Kurtosis: 26937.585
Mean: 13.793034
Median Absolute Deviation (MAD): 0
Skewness: 146.20047
Sum: 679265.56
Variance: 62123.917
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0 31272
63.5%
10 1283
 
2.6%
5 730
 
1.5%
20 399
 
0.8%
15 365
 
0.7%
7.5 212
 
0.4%
25 196
 
0.4%
2.5 144
 
0.3%
50 137
 
0.3%
30 133
 
0.3%
Other values (8230) 14376
29.2%
ValueCountFrequency (%)
-571.2 1
 
< 0.1%
-252 1
 
< 0.1%
-44.836 1
 
< 0.1%
-41.801 1
 
< 0.1%
-41.548 1
 
< 0.1%
-41.092 1
 
< 0.1%
-26.392 1
 
< 0.1%
0 31272
63.5%
0.001 9
 
< 0.1%
0.002 2
 
< 0.1%
ValueCountFrequency (%)
47502.388 1
< 0.1%
10868.464 1
< 0.1%
9500 1
< 0.1%
7688.89 1
< 0.1%
7566.196 1
< 0.1%
5265.101 1
< 0.1%
5071.726 1
< 0.1%
5000 1
< 0.1%
4673.611 1
< 0.1%
4647.183 1
< 0.1%

TOTAL_SEC
Real number (ℝ)

HIGH CORRELATION  SKEWED  ZEROS 

Distinct: 31751
Distinct (%): 64.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 306.38494
Minimum: -0.002
Maximum: 1927510.7
Zeros: 1693
Zeros (%): 3.4%
Negative: 1
Negative (%): < 0.1%
Memory size: 384.9 KiB

Quantile statistics

Minimum: -0.002
5-th percentile: 25.5501
Q1: 199.673
median: 253.147
Q3: 302.88
95-th percentile: 437.2434
Maximum: 1927510.7
Range: 1927510.7
Interquartile range (IQR): 103.207

Descriptive statistics

Standard deviation: 8696.5284
Coefficient of variation (CV): 28.384321
Kurtosis: 48975.969
Mean: 306.38494
Median Absolute Deviation (MAD): 52.145
Skewness: 221.00818
Sum: 15088539
Variance: 75629606
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0 1693
 
3.4%
250 177
 
0.4%
240 166
 
0.3%
300 155
 
0.3%
260 154
 
0.3%
200 142
 
0.3%
275 138
 
0.3%
220 134
 
0.3%
280 126
 
0.3%
270 125
 
0.3%
Other values (31741) 46237
93.9%
ValueCountFrequency (%)
-0.002 1
 
< 0.1%
0 1693
3.4%
0.001 2
 
< 0.1%
0.023 1
 
< 0.1%
0.259 1
 
< 0.1%
0.299 1
 
< 0.1%
0.5 1
 
< 0.1%
0.513 1
 
< 0.1%
0.53 1
 
< 0.1%
0.563 3
 
< 0.1%
ValueCountFrequency (%)
1927510.712 1
< 0.1%
47502.388 1
< 0.1%
43682.359 1
< 0.1%
30805.26 1
< 0.1%
23207.558 1
< 0.1%
20514.945 1
< 0.1%
17735.283 1
< 0.1%
14832.532 1
< 0.1%
13430.035 1
< 0.1%
13323.658 1
< 0.1%

YEAR
Real number (ℝ)

Distinct: 10
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2014.6417
Minimum: 2010
Maximum: 2019
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 384.9 KiB

Quantile statistics

Minimum: 2010
5-th percentile: 2010
Q1: 2012
median: 2015
Q3: 2017
95-th percentile: 2019
Maximum: 2019
Range: 9
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 2.8685722
Coefficient of variation (CV): 0.0014238622
Kurtosis: -1.2164314
Mean: 2014.6417
Median Absolute Deviation (MAD): 2
Skewness: -0.060738892
Sum: 99215060
Variance: 8.2287063
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%)
2019 5310
10.8%
2018 5232
10.6%
2016 5096
10.3%
2017 5068
10.3%
2015 5000
10.2%
2014 4868
9.9%
2013 4812
9.8%
2012 4676
9.5%
2011 4632
9.4%
2010 4553
9.2%
ValueCountFrequency (%)
2010 4553
9.2%
2011 4632
9.4%
2012 4676
9.5%
2013 4812
9.8%
2014 4868
9.9%
2015 5000
10.2%
2016 5096
10.3%
2017 5068
10.3%
2018 5232
10.6%
2019 5310
10.8%
ValueCountFrequency (%)
2019 5310
10.8%
2018 5232
10.6%
2017 5068
10.3%
2016 5096
10.3%
2015 5000
10.2%
2014 4868
9.9%
2013 4812
9.8%
2012 4676
9.5%
2011 4632
9.4%
2010 4553
9.2%

CONAME
Categorical

Distinct: 495
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 384.9 KiB
CME GROUP INC: 264
TRUIST FINANCIAL CORP: 179
GENERAL ELECTRIC CO: 170
WELLS FARGO & CO: 161
BANK OF AMERICA CORP: 157
Other values (490): 48316

Length

Max length: 28
Median length: 20
Mean length: 17.177757
Min length: 5

Characters and Unicode

Total characters: 845953
Distinct characters: 37
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: AMERICAN AIRLINES GROUP INC
2nd row: AMERICAN AIRLINES GROUP INC
3rd row: AMERICAN AIRLINES GROUP INC
4th row: AMERICAN AIRLINES GROUP INC
5th row: AMERICAN AIRLINES GROUP INC

Common Values

ValueCountFrequency (%)
CME GROUP INC 264
 
0.5%
TRUIST FINANCIAL CORP 179
 
0.4%
GENERAL ELECTRIC CO 170
 
0.3%
WELLS FARGO & CO 161
 
0.3%
BANK OF AMERICA CORP 157
 
0.3%
PNC FINANCIAL SVCS GROUP INC 156
 
0.3%
COCA-COLA CO 154
 
0.3%
CITIGROUP INC 153
 
0.3%
US BANCORP 153
 
0.3%
CBOE GLOBAL MARKETS INC 152
 
0.3%
Other values (485) 47548
96.6%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
inc 24191
 
17.4%
corp 11521
 
8.3%
co 5348
 
3.9%
group 2676
 
1.9%
2364
 
1.7%
energy 1886
 
1.4%
financial 1650
 
1.2%
plc 1448
 
1.0%
technologies 1119
 
0.8%
holdings 1054
 
0.8%
Other values (699) 85410
61.6%

Most occurring characters

ValueCountFrequency (%)
(space) 89696
10.6%
C 75849
 
9.0%
N 74174
 
8.8%
I 68352
 
8.1%
E 65656
 
7.8%
O 63991
 
7.6%
R 61254
 
7.2%
A 52302
 
6.2%
S 40454
 
4.8%
T 40376
 
4.8%
Other values (27) 213849
25.3%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 747222
88.3%
Space Separator 89696
 
10.6%
Other Punctuation 3799
 
0.4%
Dash Punctuation 1935
 
0.2%
Close Punctuation 1433
 
0.2%
Open Punctuation 1433
 
0.2%
Decimal Number 435
 
0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
C 75849
10.2%
N 74174
9.9%
I 68352
9.1%
E 65656
 
8.8%
O 63991
 
8.6%
R 61254
 
8.2%
A 52302
 
7.0%
S 40454
 
5.4%
T 40376
 
5.4%
L 38277
 
5.1%
Other values (16) 166537
22.3%
Other Punctuation
ValueCountFrequency (%)
& 2748
72.3%
' 484
 
12.7%
. 392
 
10.3%
/ 175
 
4.6%
Decimal Number
ValueCountFrequency (%)
3 226
52.0%
6 128
29.4%
5 81
 
18.6%
Space Separator
ValueCountFrequency (%)
(space) 89696
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1935
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1433
100.0%
Open Punctuation
ValueCountFrequency (%)
( 1433
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 747222
88.3%
Common 98731
 
11.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
C 75849
10.2%
N 74174
9.9%
I 68352
9.1%
E 65656
 
8.8%
O 63991
 
8.6%
R 61254
 
8.2%
A 52302
 
7.0%
S 40454
 
5.4%
T 40376
 
5.4%
L 38277
 
5.1%
Other values (16) 166537
22.3%
Common
ValueCountFrequency (%)
(space) 89696
90.8%
& 2748
 
2.8%
- 1935
 
2.0%
) 1433
 
1.5%
( 1433
 
1.5%
' 484
 
0.5%
. 392
 
0.4%
3 226
 
0.2%
/ 175
 
0.2%
6 128
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 845953
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
(space) 89696
10.6%
C 75849
 
9.0%
N 74174
 
8.8%
I 68352
 
8.1%
E 65656
 
7.8%
O 63991
 
7.6%
R 61254
 
7.2%
A 52302
 
6.2%
S 40454
 
4.8%
T 40376
 
4.8%
Other values (27) 213849
25.3%

CUSIP
Categorical

Distinct: 495
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 384.9 KiB
12572Q10: 264
89832Q10: 179
36960430: 170
94974610: 161
06050510: 157
Other values (490): 48316

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Characters and Unicode

Total characters: 393976
Distinct characters: 33
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 02376R10
2nd row: 02376R10
3rd row: 02376R10
4th row: 02376R10
5th row: 02376R10

Common Values

ValueCountFrequency (%)
12572Q10 264
 
0.5%
89832Q10 179
 
0.4%
36960430 170
 
0.3%
94974610 161
 
0.3%
06050510 157
 
0.3%
69347510 156
 
0.3%
19121610 154
 
0.3%
17296742 153
 
0.3%
90297330 153
 
0.3%
12503M10 152
 
0.3%
Other values (485) 47548
96.6%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
12572q10 264
 
0.5%
89832q10 179
 
0.4%
36960430 170
 
0.3%
94974610 161
 
0.3%
06050510 157
 
0.3%
69347510 156
 
0.3%
19121610 154
 
0.3%
17296742 153
 
0.3%
90297330 153
 
0.3%
12503m10 152
 
0.3%
Other values (485) 47548
96.6%

Most occurring characters

ValueCountFrequency (%)
0 80954
20.5%
1 75076
19.1%
4 30261
 
7.7%
2 29169
 
7.4%
6 28519
 
7.2%
3 28498
 
7.2%
5 27958
 
7.1%
7 25370
 
6.4%
8 23851
 
6.1%
9 22850
 
5.8%
Other values (23) 21470
 
5.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 372506
94.6%
Uppercase Letter 21470
 
5.4%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
G 2671
 
12.4%
L 1577
 
7.3%
R 1305
 
6.1%
C 1272
 
5.9%
V 1266
 
5.9%
E 1187
 
5.5%
P 1172
 
5.5%
T 1131
 
5.3%
H 1090
 
5.1%
Q 1068
 
5.0%
Other values (13) 7731
36.0%
Decimal Number
ValueCountFrequency (%)
0 80954
21.7%
1 75076
20.2%
4 30261
 
8.1%
2 29169
 
7.8%
6 28519
 
7.7%
3 28498
 
7.7%
5 27958
 
7.5%
7 25370
 
6.8%
8 23851
 
6.4%
9 22850
 
6.1%

Most occurring scripts

ValueCountFrequency (%)
Common 372506
94.6%
Latin 21470
 
5.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
G 2671
 
12.4%
L 1577
 
7.3%
R 1305
 
6.1%
C 1272
 
5.9%
V 1266
 
5.9%
E 1187
 
5.5%
P 1172
 
5.5%
T 1131
 
5.3%
H 1090
 
5.1%
Q 1068
 
5.0%
Other values (13) 7731
36.0%
Common
ValueCountFrequency (%)
0 80954
21.7%
1 75076
20.2%
4 30261
 
8.1%
2 29169
 
7.8%
6 28519
 
7.7%
3 28498
 
7.7%
5 27958
 
7.5%
7 25370
 
6.8%
8 23851
 
6.4%
9 22850
 
6.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 393976
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 80954
20.5%
1 75076
19.1%
4 30261
 
7.7%
2 29169
 
7.4%
6 28519
 
7.2%
3 28498
 
7.2%
5 27958
 
7.1%
7 25370
 
6.4%
8 23851
 
6.1%
9 22850
 
5.8%
Other values (23) 21470
 
5.4%

EXCHANGE
Categorical

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 384.9 KiB
NYS: 35321
NAS: 13774
OTH: 152

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 147741
Distinct characters: 7
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: NAS
2nd row: NAS
3rd row: NAS
4th row: NAS
5th row: NAS

Common Values

Value  Count  Frequency (%)
NYS    35321  71.7%
NAS    13774  28.0%
OTH    152    0.3%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
nys 35321
71.7%
nas 13774
 
28.0%
oth 152
 
0.3%

Most occurring characters

ValueCountFrequency (%)
N 49095
33.2%
S 49095
33.2%
Y 35321
23.9%
A 13774
 
9.3%
O 152
 
0.1%
T 152
 
0.1%
H 152
 
0.1%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 147741
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
N 49095
33.2%
S 49095
33.2%
Y 35321
23.9%
A 13774
 
9.3%
O 152
 
0.1%
T 152
 
0.1%
H 152
 
0.1%

Most occurring scripts

ValueCountFrequency (%)
Latin 147741
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
N 49095
33.2%
S 49095
33.2%
Y 35321
23.9%
A 13774
 
9.3%
O 152
 
0.1%
T 152
 
0.1%
H 152
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 147741
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
N 49095
33.2%
S 49095
33.2%
Y 35321
23.9%
A 13774
 
9.3%
O 152
 
0.1%
T 152
 
0.1%
H 152
 
0.1%

ADDRESS
Categorical

Distinct: 492
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 384.9 KiB
20 South Wacker Drive: 264
One Energy Plaza: 236
One PPG Place: 186
214 North Tryon Street: 179
5 Necco Street: 170
Other values (487): 48212

Length

Max length: 60
Median length: 48
Mean length: 24.314638
Min length: 9

Characters and Unicode

Total characters: 1197423
Distinct characters: 68
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1 Skyview Drive
2nd row: 1 Skyview Drive
3rd row: 1 Skyview Drive
4th row: 1 Skyview Drive
5th row: 1 Skyview Drive

Common Values

ValueCountFrequency (%)
20 South Wacker Drive 264
 
0.5%
One Energy Plaza 236
 
0.5%
One PPG Place 186
 
0.4%
214 North Tryon Street 179
 
0.4%
5 Necco Street 170
 
0.3%
420 Montgomery Street 161
 
0.3%
Bank of America Corporate Center, 100 North Tryon Street 157
 
0.3%
The Tower at PNC Plaza, 300 Fifth Avenue 156
 
0.3%
One Coca-Cola Plaza 154
 
0.3%
800 Nicollet Mall 153
 
0.3%
Other values (482) 47431
96.3%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
street 11044
 
5.4%
avenue 7266
 
3.5%
suite 6537
 
3.2%
drive 6142
 
3.0%
road 5826
 
2.8%
one 4628
 
2.3%
boulevard 4518
 
2.2%
south 4003
 
2.0%
west 3809
 
1.9%
north 2910
 
1.4%
Other values (894) 148384
72.4%

Most occurring characters

ValueCountFrequency (%)
(space) 155820
 
13.0%
e 113033
 
9.4%
t 67341
 
5.6%
a 67023
 
5.6%
r 65614
 
5.5%
0 55120
 
4.6%
o 53405
 
4.5%
n 48417
 
4.0%
i 41680
 
3.5%
l 32464
 
2.7%
Other values (58) 497506
41.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 685228
57.2%
Decimal Number 183067
 
15.3%
Space Separator 155820
 
13.0%
Uppercase Letter 154846
 
12.9%
Other Punctuation 17924
 
1.5%
Dash Punctuation 538
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
S 26839
17.3%
P 15145
 
9.8%
B 11935
 
7.7%
A 11189
 
7.2%
C 10673
 
6.9%
W 10397
 
6.7%
O 8123
 
5.2%
D 8096
 
5.2%
R 7967
 
5.1%
N 6508
 
4.2%
Other values (16) 37974
24.5%
Lowercase Letter
ValueCountFrequency (%)
e 113033
16.5%
t 67341
9.8%
a 67023
9.8%
r 65614
9.6%
o 53405
 
7.8%
n 48417
 
7.1%
i 41680
 
6.1%
l 32464
 
4.7%
u 32040
 
4.7%
s 24356
 
3.6%
Other values (15) 139855
20.4%
Decimal Number
ValueCountFrequency (%)
0 55120
30.1%
1 31333
17.1%
5 20074
 
11.0%
2 17849
 
9.7%
3 13603
 
7.4%
4 10083
 
5.5%
7 9729
 
5.3%
9 8625
 
4.7%
6 8595
 
4.7%
8 8056
 
4.4%
Other Punctuation
ValueCountFrequency (%)
, 15143
84.5%
. 1844
 
10.3%
& 373
 
2.1%
' 342
 
1.9%
/ 222
 
1.2%
Space Separator
ValueCountFrequency (%)
(space) 155820
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 538
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 840074
70.2%
Common 357349
29.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 113033
 
13.5%
t 67341
 
8.0%
a 67023
 
8.0%
r 65614
 
7.8%
o 53405
 
6.4%
n 48417
 
5.8%
i 41680
 
5.0%
l 32464
 
3.9%
u 32040
 
3.8%
S 26839
 
3.2%
Other values (41) 292218
34.8%
Common
ValueCountFrequency (%)
(space) 155820
43.6%
0 55120
 
15.4%
1 31333
 
8.8%
5 20074
 
5.6%
2 17849
 
5.0%
, 15143
 
4.2%
3 13603
 
3.8%
4 10083
 
2.8%
7 9729
 
2.7%
9 8625
 
2.4%
Other values (7) 19970
 
5.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1197423
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
(space) 155820
 
13.0%
e 113033
 
9.4%
t 67341
 
5.6%
a 67023
 
5.6%
r 65614
 
5.5%
0 55120
 
4.6%
o 53405
 
4.5%
n 48417
 
4.0%
i 41680
 
3.5%
l 32464
 
2.7%
Other values (58) 497506
41.5%

CITY
Categorical

Distinct: 229
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 384.9 KiB
New York: 3958
Houston: 1860
Atlanta: 1722
Chicago: 1705
Dallas: 1103
Other values (224): 38899

Length

Max length16
Median length14
Mean length8.4628708
Min length4

Characters and Unicode

Total characters416771
Distinct characters49
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowFort Worth
2nd rowFort Worth
3rd rowFort Worth
4th rowFort Worth
5th rowFort Worth

Common Values

ValueCountFrequency (%)
New York 3958
 
8.0%
Houston 1860
 
3.8%
Atlanta 1722
 
3.5%
Chicago 1705
 
3.5%
Dallas 1103
 
2.2%
Dublin 1037
 
2.1%
Charlotte 980
 
2.0%
San Jose 766
 
1.6%
Boston 746
 
1.5%
Santa Clara 708
 
1.4%
Other values (219) 34662
70.4%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
new 4292
 
6.9%
york 3958
 
6.3%
san 2556
 
4.1%
chicago 1892
 
3.0%
houston 1860
 
3.0%
atlanta 1722
 
2.7%
dallas 1103
 
1.8%
dublin 1037
 
1.7%
charlotte 980
 
1.6%
santa 842
 
1.3%
Other values (247) 42405
67.7%

Most occurring characters

ValueCountFrequency (%)
a 37403
 
9.0%
o 35557
 
8.5%
n 34047
 
8.2%
e 32721
 
7.9%
i 26102
 
6.3%
l 24276
 
5.8%
t 23723
 
5.7%
r 21971
 
5.3%
s 18307
 
4.4%
13400
 
3.2%
Other values (39) 149264
35.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 340507
81.7%
Uppercase Letter 62864
 
15.1%
Space Separator 13400
 
3.2%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 37403
11.0%
o 35557
10.4%
n 34047
10.0%
e 32721
9.6%
i 26102
 
7.7%
l 24276
 
7.1%
t 23723
 
7.0%
r 21971
 
6.5%
s 18307
 
5.4%
h 11256
 
3.3%
Other values (15) 75144
22.1%
Uppercase Letter
ValueCountFrequency (%)
C 7577
12.1%
S 6535
 
10.4%
N 5596
 
8.9%
M 4444
 
7.1%
D 4302
 
6.8%
Y 3958
 
6.3%
B 3928
 
6.2%
A 3660
 
5.8%
P 2881
 
4.6%
H 2838
 
4.5%
Other values (13) 17145
27.3%
Space Separator
ValueCountFrequency (%)
13400
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 403371
96.8%
Common 13400
 
3.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 37403
 
9.3%
o 35557
 
8.8%
n 34047
 
8.4%
e 32721
 
8.1%
i 26102
 
6.5%
l 24276
 
6.0%
t 23723
 
5.9%
r 21971
 
5.4%
s 18307
 
4.5%
h 11256
 
2.8%
Other values (38) 138008
34.2%
Common
ValueCountFrequency (%)
13400
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 416771
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 37403
 
9.0%
o 35557
 
8.5%
n 34047
 
8.2%
e 32721
 
7.9%
i 26102
 
6.3%
l 24276
 
5.8%
t 23723
 
5.7%
r 21971
 
5.3%
s 18307
 
4.4%
13400
 
3.2%
Other values (39) 149264
35.8%
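
The category breakdown for CITY (Lowercase Letter, Uppercase Letter, Space Separator) can be reproduced approximately with Python's standard `unicodedata` module. This is a minimal sketch, not the profiler's own code: the `cities` list is a small hypothetical sample and `category_counts` is an illustrative helper name. The two-letter codes returned by `unicodedata.category` map to the labels used in the report (`Lu` = Uppercase Letter, `Ll` = Lowercase Letter, `Zs` = Space Separator).

```python
import unicodedata
from collections import Counter

# Hypothetical sample; the report itself was computed over all 49247 CITY values.
cities = ["Fort Worth", "New York", "Houston"]


def category_counts(values):
    """Count characters by Unicode general category code (e.g. 'Lu', 'Ll', 'Zs')."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts


counts = category_counts(cities)
print(counts)
```

Run over the full column, the three totals should sum to the reported total character count (416771).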

STATE
Categorical

HIGH CORRELATION  MISSING 

Distinct 38
Distinct (%) 0.1%
Missing 2082
Missing (%) 4.2%
Memory size 384.9 KiB

CA 6191
NY 5187
TX 4389
IL 3375
OH 2336
Other values (33) 25687

Length

Max length 2
Median length 2
Mean length 2
Min length 2

Characters and Unicode

Total characters 94330
Distinct characters 23
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row TX
2nd row TX
3rd row TX
4th row TX
5th row TX

Common Values

ValueCountFrequency (%)
CA 6191
 
12.6%
NY 5187
 
10.5%
TX 4389
 
8.9%
IL 3375
 
6.9%
OH 2336
 
4.7%
MA 2049
 
4.2%
PA 1951
 
4.0%
GA 1920
 
3.9%
VA 1769
 
3.6%
NC 1666
 
3.4%
Other values (28) 16332
33.2%
(Missing) 2082
 
4.2%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
ca 6191
 
13.1%
ny 5187
 
11.0%
tx 4389
 
9.3%
il 3375
 
7.2%
oh 2336
 
5.0%
ma 2049
 
4.3%
pa 1951
 
4.1%
ga 1920
 
4.1%
va 1769
 
3.8%
nc 1666
 
3.5%
Other values (28) 16332
34.6%

Most occurring characters

ValueCountFrequency (%)
A 16740
17.7%
N 11859
12.6%
C 10109
10.7%
I 6553
 
6.9%
T 6518
 
6.9%
M 5845
 
6.2%
Y 5499
 
5.8%
L 5443
 
5.8%
X 4389
 
4.7%
O 4240
 
4.5%
Other values (13) 17135
18.2%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 94330
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A 16740
17.7%
N 11859
12.6%
C 10109
10.7%
I 6553
 
6.9%
T 6518
 
6.9%
M 5845
 
6.2%
Y 5499
 
5.8%
L 5443
 
5.8%
X 4389
 
4.7%
O 4240
 
4.5%
Other values (13) 17135
18.2%

Most occurring scripts

ValueCountFrequency (%)
Latin 94330
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 16740
17.7%
N 11859
12.6%
C 10109
10.7%
I 6553
 
6.9%
T 6518
 
6.9%
M 5845
 
6.2%
Y 5499
 
5.8%
L 5443
 
5.8%
X 4389
 
4.7%
O 4240
 
4.5%
Other values (13) 17135
18.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 94330
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A 16740
17.7%
N 11859
12.6%
C 10109
10.7%
I 6553
 
6.9%
T 6518
 
6.9%
M 5845
 
6.2%
Y 5499
 
5.8%
L 5443
 
5.8%
X 4389
 
4.7%
O 4240
 
4.5%
Other values (13) 17135
18.2%
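
The two STATE frequency tables show different percentages for the same counts: CA is 12.6% under Common Values but 13.1% in the lowercase token table. The numbers are consistent with the first table dividing by all rows and the second dividing by non-missing rows only. That reading is inferred from the figures, not from documented profiler behaviour, and the sketch below only checks the arithmetic.

```python
# Figures copied from the STATE summary above.
n_rows = 49247      # number of observations
n_missing = 2082    # rows with no STATE value
ca_count = 6191     # rows where STATE == "CA"

n_present = n_rows - n_missing  # rows that carry a STATE value

share_of_all_rows = 100 * ca_count / n_rows
share_of_present = 100 * ca_count / n_present

print(f"{share_of_all_rows:.1f}% of all rows")         # 12.6%
print(f"{share_of_present:.1f}% of non-missing rows")  # 13.1%
```

The non-missing row count (47165) also matches the reported total of 94330 characters at two characters per value.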

ZIP
Categorical

Distinct 384
Distinct (%) 0.8%
Missing 113
Missing (%) 0.2%
Memory size 384.9 KiB

10036 913
60606 530
95054 524
77002 524
30328 473
Other values (379) 46170

Length

Max length 5
Median length 5
Mean length 4.9741523
Min length 1

Characters and Unicode

Total characters 244400
Distinct characters 27
Distinct categories 3
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row 76155
2nd row 76155
3rd row 76155
4th row 76155
5th row 76155

Common Values

ValueCountFrequency (%)
10036 913
 
1.9%
60606 530
 
1.1%
95054 524
 
1.1%
77002 524
 
1.1%
30328 473
 
1.0%
10001 467
 
0.9%
60015 443
 
0.9%
28202 435
 
0.9%
75039 419
 
0.9%
20190 412
 
0.8%
Other values (374) 43994
89.3%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
10036 913
 
1.8%
60606 530
 
1.1%
95054 524
 
1.0%
77002 524
 
1.0%
30328 473
 
0.9%
10001 467
 
0.9%
d02 462
 
0.9%
60015 443
 
0.9%
28202 435
 
0.9%
75039 419
 
0.8%
Other values (379) 44937
89.6%

Most occurring characters

ValueCountFrequency (%)
0 50247
20.6%
1 32535
13.3%
2 32133
13.1%
3 21339
8.7%
4 20382
8.3%
5 19324
 
7.9%
6 17285
 
7.1%
7 17268
 
7.1%
9 15157
 
6.2%
8 14994
 
6.1%
Other values (17) 3736
 
1.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 240664
98.5%
Uppercase Letter 2743
 
1.1%
Space Separator 993
 
0.4%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
D 711
25.9%
M 283
 
10.3%
T 218
 
7.9%
H 168
 
6.1%
V 143
 
5.2%
Y 126
 
4.6%
K 123
 
4.5%
C 115
 
4.2%
E 115
 
4.2%
P 114
 
4.2%
Other values (6) 627
22.9%
Decimal Number
ValueCountFrequency (%)
0 50247
20.9%
1 32535
13.5%
2 32133
13.4%
3 21339
8.9%
4 20382
8.5%
5 19324
 
8.0%
6 17285
 
7.2%
7 17268
 
7.2%
9 15157
 
6.3%
8 14994
 
6.2%
Space Separator
ValueCountFrequency (%)
993
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 241657
98.9%
Latin 2743
 
1.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
D 711
25.9%
M 283
 
10.3%
T 218
 
7.9%
H 168
 
6.1%
V 143
 
5.2%
Y 126
 
4.6%
K 123
 
4.5%
C 115
 
4.2%
E 115
 
4.2%
P 114
 
4.2%
Other values (6) 627
22.9%
Common
ValueCountFrequency (%)
0 50247
20.8%
1 32535
13.5%
2 32133
13.3%
3 21339
8.8%
4 20382
8.4%
5 19324
 
8.0%
6 17285
 
7.2%
7 17268
 
7.1%
9 15157
 
6.3%
8 14994
 
6.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 244400
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 50247
20.6%
1 32535
13.3%
2 32133
13.1%
3 21339
8.7%
4 20382
8.3%
5 19324
 
7.9%
6 17285
 
7.1%
7 17268
 
7.1%
9 15157
 
6.2%
8 14994
 
6.1%
Other values (17) 3736
 
1.5%
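
The ZIP statistics (minimum length 1, uppercase letters and spaces among the characters, and a token such as `d02` in the word table) suggest the column mixes five-digit US ZIP codes with other formats. One way to separate them before analysis is sketched below. `classify_zip` and its labels are hypothetical names chosen for illustration, and the last three sample values are invented, not taken from the data.

```python
import re

US_ZIP5 = re.compile(r"\d{5}")


def classify_zip(value):
    """Return a coarse label for a postal-code string (hypothetical helper)."""
    if value is None or value.strip() == "":
        return "missing"
    text = value.strip()
    if US_ZIP5.fullmatch(text):
        return "us_zip5"
    if text.isdigit():
        return "short_numeric"  # possibly a ZIP code that lost leading zeros
    return "other"              # contains letters or spaces


# First two values appear in the report; the rest are invented examples.
samples = ["76155", "10036", "2110", "D02 X576", None]
labels = [classify_zip(s) for s in samples]
print(labels)
```

Keeping the column as text rather than casting it to a number avoids dropping leading zeros and preserves the non-numeric codes.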

SICDESC
Categorical

Distinct 178
Distinct (%) 0.4%
Missing 0
Missing (%) 0.0%
Memory size 384.9 KiB

COMMERCIAL BANKS 2562
REAL ESTATE INVESTMENT TRUSTS 2422
COMPUTER PROGRAMMING, DATA PROCESSING, AND OTHER C 1741
ELECTRIC AND OTHER SERVICES COMBINED 1606
SEMICONDUCTORS AND RELATED DEVICES 1472
Other values (173) 39444

Length

Max length 50
Median length 44
Mean length 31.699961
Min length 8

Characters and Unicode

Total characters 1561128
Distinct characters 35
Distinct categories 6
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row AIR TRANSPORTATION, SCHEDULED
2nd row AIR TRANSPORTATION, SCHEDULED
3rd row AIR TRANSPORTATION, SCHEDULED
4th row AIR TRANSPORTATION, SCHEDULED
5th row AIR TRANSPORTATION, SCHEDULED

Common Values

ValueCountFrequency (%)
COMMERCIAL BANKS 2562
 
5.2%
REAL ESTATE INVESTMENT TRUSTS 2422
 
4.9%
COMPUTER PROGRAMMING, DATA PROCESSING, AND OTHER C 1741
 
3.5%
ELECTRIC AND OTHER SERVICES COMBINED 1606
 
3.3%
SEMICONDUCTORS AND RELATED DEVICES 1472
 
3.0%
ELECTRIC SERVICES 1376
 
2.8%
FIRE, MARINE, AND CASUALTY INSURANCE 1218
 
2.5%
PHARMACEUTICAL PREPARATIONS 1162
 
2.4%
PREPACKAGED SOFTWARE 1053
 
2.1%
CRUDE PETROLEUM AND NATURAL GAS 922
 
1.9%
Other values (168) 33713
68.5%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
and 22505
 
11.4%
services 5717
 
2.9%
other 4525
 
2.3%
4052
 
2.1%
computer 3443
 
1.7%
investment 3207
 
1.6%
electric 3082
 
1.6%
commercial 3061
 
1.5%
insurance 2576
 
1.3%
banks 2562
 
1.3%
Other values (368) 142837
72.3%

Most occurring characters

ValueCountFrequency (%)
E 167072
10.7%
148320
 
9.5%
A 132419
 
8.5%
R 115745
 
7.4%
S 109167
 
7.0%
I 103379
 
6.6%
N 102936
 
6.6%
T 102786
 
6.6%
C 88575
 
5.7%
O 76701
 
4.9%
Other values (25) 414028
26.5%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 1382178
88.5%
Space Separator 148320
 
9.5%
Other Punctuation 28753
 
1.8%
Dash Punctuation 1521
 
0.1%
Open Punctuation 178
 
< 0.1%
Close Punctuation 178
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
E 167072
12.1%
A 132419
9.6%
R 115745
 
8.4%
S 109167
 
7.9%
I 103379
 
7.5%
N 102936
 
7.4%
T 102786
 
7.4%
C 88575
 
6.4%
O 76701
 
5.5%
D 61223
 
4.4%
Other values (16) 322175
23.3%
Other Punctuation
ValueCountFrequency (%)
, 23958
83.3%
& 4052
 
14.1%
' 437
 
1.5%
; 233
 
0.8%
: 73
 
0.3%
Space Separator
ValueCountFrequency (%)
148320
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1521
100.0%
Open Punctuation
ValueCountFrequency (%)
( 178
100.0%
Close Punctuation
ValueCountFrequency (%)
) 178
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1382178
88.5%
Common 178950
 
11.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
E 167072
12.1%
A 132419
9.6%
R 115745
 
8.4%
S 109167
 
7.9%
I 103379
 
7.5%
N 102936
 
7.4%
T 102786
 
7.4%
C 88575
 
6.4%
O 76701
 
5.5%
D 61223
 
4.4%
Other values (16) 322175
23.3%
Common
ValueCountFrequency (%)
148320
82.9%
, 23958
 
13.4%
& 4052
 
2.3%
- 1521
 
0.8%
' 437
 
0.2%
; 233
 
0.1%
( 178
 
0.1%
) 178
 
0.1%
: 73
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1561128
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
E 167072
10.7%
148320
 
9.5%
A 132419
 
8.5%
R 115745
 
7.4%
S 109167
 
7.0%
I 103379
 
6.6%
N 102936
 
6.6%
T 102786
 
6.6%
C 88575
 
5.7%
O 76701
 
4.9%
Other values (25) 414028
26.5%

NAICSDESC
Categorical

Distinct 203
Distinct (%) 0.4%
Missing 0
Missing (%) 0.0%
Memory size 384.9 KiB

Commercial Banking 2596
Electric Power Generation 1629
Semiconductor and Related Device Manufacturing 1472
Electric Power Generation, Transmission and Distri 1466
Lessors of Nonresidential Buildings (except Miniwa 1333
Other values (198) 40751

Length

Max length 50
Median length 46
Mean length 38.158568
Min length 9

Characters and Unicode

Total characters 1879195
Distinct characters 66
Distinct categories 9
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row Scheduled Passenger Air Transportation
2nd row Scheduled Passenger Air Transportation
3rd row Scheduled Passenger Air Transportation
4th row Scheduled Passenger Air Transportation
5th row Scheduled Passenger Air Transportation

Common Values

ValueCountFrequency (%)
Commercial Banking 2596
 
5.3%
Electric Power Generation 1629
 
3.3%
Semiconductor and Related Device Manufacturing 1472
 
3.0%
Electric Power Generation, Transmission and Distri 1466
 
3.0%
Lessors of Nonresidential Buildings (except Miniwa 1333
 
2.7%
Direct Property and Casualty Insurance Carriers 1269
 
2.6%
Pharmaceutical Preparation Manufacturing 1162
 
2.4%
Oil and Gas Extraction 928
 
1.9%
Data Processing, Hosting, and Related Services (ef 863
 
1.8%
Lessors of Residential Buildings and Dwellings 863
 
1.8%
Other values (193) 35666
72.4%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
and 25531
 
11.1%
manufacturing 12645
 
5.5%
other 4591
 
2.0%
power 3207
 
1.4%
electric 3095
 
1.3%
generation 3095
 
1.3%
eff 3040
 
1.3%
banking 3020
 
1.3%
insurance 2998
 
1.3%
related 2903
 
1.3%
Other values (434) 166069
72.1%

Most occurring characters

ValueCountFrequency (%)
181307
 
9.6%
e 163260
 
8.7%
a 151970
 
8.1%
n 143177
 
7.6%
r 136309
 
7.3%
i 133443
 
7.1%
t 112006
 
6.0%
s 81020
 
4.3%
o 78884
 
4.2%
c 74965
 
4.0%
Other values (56) 622854
33.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1461080
77.8%
Uppercase Letter 187665
 
10.0%
Space Separator 181307
 
9.6%
Decimal Number 17663
 
0.9%
Other Punctuation 15144
 
0.8%
Open Punctuation 9538
 
0.5%
Close Punctuation 2546
 
0.1%
Dash Punctuation 2351
 
0.1%
Connector Punctuation 1901
 
0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 163260
11.2%
a 151970
10.4%
n 143177
9.8%
r 136309
9.3%
i 133443
9.1%
t 112006
 
7.7%
s 81020
 
5.5%
o 78884
 
5.4%
c 74965
 
5.1%
u 67632
 
4.6%
Other values (16) 318414
21.8%
Uppercase Letter
ValueCountFrequency (%)
M 24961
13.3%
P 19843
10.6%
S 19550
10.4%
C 18171
9.7%
D 14041
 
7.5%
A 10851
 
5.8%
E 10215
 
5.4%
B 9624
 
5.1%
I 8494
 
4.5%
G 8047
 
4.3%
Other values (14) 43868
23.4%
Decimal Number
ValueCountFrequency (%)
2 5545
31.4%
0 3317
18.8%
6 3046
17.2%
1 2621
14.8%
5 1377
 
7.8%
4 1052
 
6.0%
3 513
 
2.9%
8 192
 
1.1%
Other Punctuation
ValueCountFrequency (%)
, 9198
60.7%
/ 5561
36.7%
' 385
 
2.5%
Space Separator
ValueCountFrequency (%)
181307
100.0%
Open Punctuation
ValueCountFrequency (%)
( 9538
100.0%
Close Punctuation
ValueCountFrequency (%)
) 2546
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 2351
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 1901
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1648745
87.7%
Common 230450
 
12.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 163260
 
9.9%
a 151970
 
9.2%
n 143177
 
8.7%
r 136309
 
8.3%
i 133443
 
8.1%
t 112006
 
6.8%
s 81020
 
4.9%
o 78884
 
4.8%
c 74965
 
4.5%
u 67632
 
4.1%
Other values (40) 506079
30.7%
Common
ValueCountFrequency (%)
181307
78.7%
( 9538
 
4.1%
, 9198
 
4.0%
/ 5561
 
2.4%
2 5545
 
2.4%
0 3317
 
1.4%
6 3046
 
1.3%
1 2621
 
1.1%
) 2546
 
1.1%
- 2351
 
1.0%
Other values (6) 5420
 
2.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1879195
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
181307
 
9.6%
e 163260
 
8.7%
a 151970
 
8.1%
n 143177
 
7.6%
r 136309
 
7.3%
i 133443
 
7.1%
t 112006
 
6.0%
s 81020
 
4.3%
o 78884
 
4.2%
c 74965
 
4.0%
Other values (56) 622854
33.1%

INDDESC
Categorical

Distinct 122
Distinct (%) 0.2%
Missing 0
Missing (%) 0.0%
Memory size 384.9 KiB

Electric Utilities 1970
Regional Banks 1602
Health Care Equipment 1458
Semiconductors 1366
Packaged Foods & Meats 1259
Other values (117) 41592

Length

Max length 44
Median length 32
Mean length 21.542124
Min length 4

Characters and Unicode

Total characters 1060885
Distinct characters 48
Distinct categories 5
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row Airlines
2nd row Airlines
3rd row Airlines
4th row Airlines
5th row Airlines

Common Values

ValueCountFrequency (%)
Electric Utilities 1970
 
4.0%
Regional Banks 1602
 
3.3%
Health Care Equipment 1458
 
3.0%
Semiconductors 1366
 
2.8%
Packaged Foods & Meats 1259
 
2.6%
Financial Exchanges & Data 1134
 
2.3%
Multi-Utilities 1122
 
2.3%
Aerospace & Defense 1113
 
2.3%
Life Sciences Tools & Services 1023
 
2.1%
Industrial Machinery 993
 
2.0%
Other values (112) 36207
73.5%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
20722
 
14.7%
services 5186
 
3.7%
health 4233
 
3.0%
equipment 3592
 
2.6%
care 3547
 
2.5%
banks 3250
 
2.3%
reits 2520
 
1.8%
insurance 2510
 
1.8%
gas 2298
 
1.6%
oil 2187
 
1.6%
Other values (175) 90592
64.4%

Most occurring characters

ValueCountFrequency (%)
e 105034
 
9.9%
91390
 
8.6%
i 80659
 
7.6%
t 71524
 
6.7%
a 71307
 
6.7%
s 69041
 
6.5%
r 61659
 
5.8%
n 59784
 
5.6%
o 49069
 
4.6%
c 46157
 
4.4%
Other values (38) 355261
33.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 816369
77.0%
Uppercase Letter 129289
 
12.2%
Space Separator 91390
 
8.6%
Other Punctuation 22259
 
2.1%
Dash Punctuation 1578
 
0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 105034
12.9%
i 80659
9.9%
t 71524
8.8%
a 71307
8.7%
s 69041
8.5%
r 61659
 
7.6%
n 59784
 
7.3%
o 49069
 
6.0%
c 46157
 
5.7%
l 43205
 
5.3%
Other values (15) 158930
19.5%
Uppercase Letter
ValueCountFrequency (%)
S 15103
11.7%
C 12851
9.9%
E 12680
9.8%
I 10110
 
7.8%
R 9329
 
7.2%
P 8650
 
6.7%
H 8074
 
6.2%
M 7984
 
6.2%
T 6665
 
5.2%
A 6601
 
5.1%
Other values (9) 31242
24.2%
Other Punctuation
ValueCountFrequency (%)
& 20722
93.1%
, 1537
 
6.9%
Space Separator
ValueCountFrequency (%)
91390
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1578
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 945658
89.1%
Common 115227
 
10.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 105034
 
11.1%
i 80659
 
8.5%
t 71524
 
7.6%
a 71307
 
7.5%
s 69041
 
7.3%
r 61659
 
6.5%
n 59784
 
6.3%
o 49069
 
5.2%
c 46157
 
4.9%
l 43205
 
4.6%
Other values (34) 288219
30.5%
Common
ValueCountFrequency (%)
91390
79.3%
& 20722
 
18.0%
- 1578
 
1.4%
, 1537
 
1.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1060885
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 105034
 
9.9%
91390
 
8.6%
i 80659
 
7.6%
t 71524
 
6.7%
a 71307
 
6.7%
s 69041
 
6.5%
r 61659
 
5.8%
n 59784
 
5.6%
o 49069
 
4.6%
c 46157
 
4.4%
Other values (38) 355261
33.5%

SPCODE
Categorical

Distinct 2
Distinct (%) < 0.1%
Missing 0
Missing (%) 0.0%
Memory size 384.9 KiB

SP 49069
EX 178

Length

Max length 2
Median length 2
Mean length 2
Min length 2

Characters and Unicode

Total characters 98494
Distinct characters 4
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row SP
2nd row SP
3rd row SP
4th row SP
5th row SP

Common Values

ValueCountFrequency (%)
SP 49069
99.6%
EX 178
 
0.4%

Length

Histogram of lengths of the category

Common Values (Plot)

[Plot of common values; image not preserved]
ValueCountFrequency (%)
sp 49069
99.6%
ex 178
 
0.4%

Most occurring characters

ValueCountFrequency (%)
S 49069
49.8%
P 49069
49.8%
E 178
 
0.2%
X 178
 
0.2%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 98494
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
S 49069
49.8%
P 49069
49.8%
E 178
 
0.2%
X 178
 
0.2%

Most occurring scripts

ValueCountFrequency (%)
Latin 98494
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
S 49069
49.8%
P 49069
49.8%
E 178
 
0.2%
X 178
 
0.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 98494
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
S 49069
49.8%
P 49069
49.8%
E 178
 
0.2%
X 178
 
0.2%
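
The report's alerts flag SPCODE as highly imbalanced (96.5%). That figure is consistent with one minus the Shannon entropy of the class frequencies, normalised by the maximum entropy for the number of classes. This is an assumption about how the score is derived, inferred from the numbers and not confirmed by the report. `imbalance_score` is a hypothetical helper name.

```python
import math


def imbalance_score(counts):
    """One minus Shannon entropy, normalised by its maximum log2(k)."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # a single class carries no entropy to normalise
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)


# SP and EX counts from the SPCODE table above.
score = imbalance_score([49069, 178])
print(f"{score:.1%}")
```

A perfectly balanced two-class column scores 0 under this definition, and the score approaches 1 as one class dominates.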

TICKER
Categorical

Distinct 495
Distinct (%) 1.0%
Missing 0
Missing (%) 0.0%
Memory size 384.9 KiB

CME 264
TFC 179
GE 170
WFC 161
BAC 157
Other values (490) 48316

Length

Max length 5
Median length 3
Mean length 3.1061994
Min length 1

Characters and Unicode

Total characters 152971
Distinct characters 27
Distinct categories 2
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row AAL
2nd row AAL
3rd row AAL
4th row AAL
5th row AAL

Common Values

ValueCountFrequency (%)
CME 264
 
0.5%
TFC 179
 
0.4%
GE 170
 
0.3%
WFC 161
 
0.3%
BAC 157
 
0.3%
PNC 156
 
0.3%
KO 154
 
0.3%
C 153
 
0.3%
USB 153
 
0.3%
CBOE 152
 
0.3%
Other values (485) 47548
96.6%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
cme 264
 
0.5%
tfc 179
 
0.4%
ge 170
 
0.3%
wfc 161
 
0.3%
bac 157
 
0.3%
pnc 156
 
0.3%
ko 154
 
0.3%
c 153
 
0.3%
usb 153
 
0.3%
cboe 152
 
0.3%
Other values (485) 47548
96.6%

Most occurring characters

ValueCountFrequency (%)
C 11320
 
7.4%
A 11113
 
7.3%
M 9543
 
6.2%
T 9354
 
6.1%
S 8993
 
5.9%
L 8310
 
5.4%
E 8213
 
5.4%
R 8019
 
5.2%
P 7764
 
5.1%
N 7352
 
4.8%
Other values (17) 62990
41.2%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 152764
99.9%
Other Punctuation 207
 
0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
C 11320
 
7.4%
A 11113
 
7.3%
M 9543
 
6.2%
T 9354
 
6.1%
S 8993
 
5.9%
L 8310
 
5.4%
E 8213
 
5.4%
R 8019
 
5.2%
P 7764
 
5.1%
N 7352
 
4.8%
Other values (16) 62783
41.1%
Other Punctuation
ValueCountFrequency (%)
. 207
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 152764
99.9%
Common 207
 
0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
C 11320
 
7.4%
A 11113
 
7.3%
M 9543
 
6.2%
T 9354
 
6.1%
S 8993
 
5.9%
L 8310
 
5.4%
E 8213
 
5.4%
R 8019
 
5.2%
P 7764
 
5.1%
N 7352
 
4.8%
Other values (16) 62783
41.1%
Common
ValueCountFrequency (%)
. 207
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 152971
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
C 11320
 
7.4%
A 11113
 
7.3%
M 9543
 
6.2%
T 9354
 
6.1%
S 8993
 
5.9%
L 8310
 
5.4%
E 8213
 
5.4%
R 8019
 
5.2%
P 7764
 
5.1%
N 7352
 
4.8%
Other values (17) 62990
41.2%

SUB_TELE
Real number (ℝ)

Distinct 150
Distinct (%) 0.3%
Missing 168
Missing (%) 0.3%
Infinite 0
Infinite (%) 0.0%
Mean 536.79087
Minimum 31
Maximum 989
Zeros 0
Zeros (%) 0.0%
Negative 0
Negative (%) 0.0%
Memory size 384.9 KiB

Quantile statistics

Minimum 31
5-th percentile 206
Q1 312
median 513
Q3 737
95-th percentile 949
Maximum 989
Range 958
Interquartile range (IQR) 425

Descriptive statistics

Standard deviation 251.99508
Coefficient of variation (CV) 0.4694474
Kurtosis -1.217201
Mean 536.79087
Median Absolute Deviation (MAD) 207
Skewness 0.074568053
Sum 26345159
Variance 63501.52
Monotonicity Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
212 3687
 
7.5%
408 1877
 
3.8%
713 1386
 
2.8%
650 1177
 
2.4%
847 1132
 
2.3%
972 1078
 
2.2%
800 1055
 
2.1%
353 1052
 
2.1%
703 1021
 
2.1%
312 972
 
2.0%
Other values (140) 34642
70.3%
ValueCountFrequency (%)
31 116
 
0.2%
41 377
0.8%
44 321
 
0.7%
201 336
 
0.7%
202 244
 
0.5%
203 902
1.8%
205 110
 
0.2%
206 675
1.4%
207 82
 
0.2%
208 101
 
0.2%
ValueCountFrequency (%)
989 10
 
< 0.1%
985 74
 
0.2%
980 202
 
0.4%
978 75
 
0.2%
973 486
1.0%
972 1078
2.2%
952 193
 
0.4%
951 63
 
0.1%
949 310
 
0.6%
941 78
 
0.2%
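
The quantile block for SUB_TELE reports Q1, median, Q3, the interquartile range and the median absolute deviation (MAD). A minimal standard-library sketch of those quantities follows. The report does not state which interpolation rule it uses, so the `inclusive` (linear) method is an assumption, and `quantile_summary` is a hypothetical helper demonstrated on toy data.

```python
import statistics


def quantile_summary(values):
    """Q1, median, Q3, IQR and MAD for a list of numbers.

    Missing values are assumed to have been removed by the caller.
    """
    q1, med, q3 = statistics.quantiles(values, n=4, method="inclusive")
    mad = statistics.median(abs(v - med) for v in values)
    return {"q1": q1, "median": med, "q3": q3, "iqr": q3 - q1, "mad": mad}


# Toy data for illustration only; not drawn from SUB_TELE.
summary = quantile_summary([1, 2, 3, 4, 5])
print(summary)
```

For the reported values, the same relationship holds: Q3 minus Q1 is 737 - 312 = 425, the stated IQR.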

NAICS
Real number (ℝ)

Distinct 205
Distinct (%) 0.4%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Mean 382695.93
Minimum 42
Maximum 999977
Zeros 0
Zeros (%) 0.0%
Negative 0
Negative (%) 0.0%
Memory size 384.9 KiB

Quantile statistics

Minimum 42
5-th percentile 2211
Q1 325412
median 339113
Q3 522210
95-th percentile 561450
Maximum 999977
Range 999935
Interquartile range (IQR) 196798

Descriptive statistics

Standard deviation 182331.85
Coefficient of variation (CV) 0.47644053
Kurtosis 0.47772293
Mean 382695.93
Median Absolute Deviation (MAD) 174097
Skewness -0.5272223
Sum 1.8846627 × 10^10
Variance 3.3244904 × 10^10
Monotonicity Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
522110 2596
 
5.3%
22111 1629
 
3.3%
334413 1472
 
3.0%
2211 1466
 
3.0%
531120 1333
 
2.7%
524126 1269
 
2.6%
325412 1162
 
2.4%
2111 928
 
1.9%
531110 863
 
1.8%
518210 863
 
1.8%
Other values (195) 35666
72.4%
ValueCountFrequency (%)
42 110
 
0.2%
111 11
 
< 0.1%
315 209
 
0.4%
321 98
 
0.2%
325 114
 
0.2%
423 106
 
0.2%
621 109
 
0.2%
2111 928
1.9%
2211 1466
3.0%
3113 104
 
0.2%
ValueCountFrequency (%)
999977 274
0.6%
812331 67
 
0.1%
722513 418
0.8%
722511 185
0.4%
721120 281
0.6%
721110 167
 
0.3%
713210 36
 
0.1%
711320 132
 
0.3%
622110 158
 
0.3%
621511 188
0.4%
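
NAICS is profiled here as a real number, but the codes are hierarchical identifiers of varying length (42, 2211, 22111, 522110), so the mean and variance above are not meaningful as quantities. In NAICS the first two digits identify the sector, which gives a more useful grouping. The sketch below assumes the codes are stored as integers or integer-valued floats; `naics_sector` is a hypothetical helper name.

```python
from collections import Counter


def naics_sector(code):
    """Return the leading two digits of a NAICS code as a string."""
    return str(int(code))[:2]


# Codes and counts copied from the most frequent NAICS values above.
common = {522110: 2596, 22111: 1629, 334413: 1472, 2211: 1466, 531120: 1333}

by_sector = Counter()
for code, count in common.items():
    by_sector[naics_sector(code)] += count

print(by_sector.most_common())
```

Grouped this way, codes 22111 and 2211 fall into the same sector ("22"), which numeric ordering would place far apart.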

SPINDEX
Real number (ℝ)

Distinct 24
Distinct (%) < 0.1%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Mean 3484.1592
Minimum 1010
Maximum 6010
Zeros 0
Zeros (%) 0.0%
Negative 0
Negative (%) 0.0%
Memory size 384.9 KiB

Quantile statistics

Minimum 1010
5-th percentile 1510
Q1 2510
median 3520
Q3 4510
95-th percentile 6010
Maximum 6010
Range 5000
Interquartile range (IQR) 2000

Descriptive statistics

Standard deviation 1349.3765
Coefficient of variation (CV) 0.38728898
Kurtosis -0.88633474
Mean 3484.1592
Median Absolute Deviation (MAD) 1000
Skewness 0.039699725
Sum 1.7158439 × 10^8
Variance 1820816.9
Monotonicity Not monotonic
Histogram with fixed size bins (bins=24)
ValueCountFrequency (%)
2010 4390
 
8.9%
5510 3395
 
6.9%
3510 3260
 
6.6%
4510 3070
 
6.2%
4020 3005
 
6.1%
1510 2847
 
5.8%
6010 2628
 
5.3%
4030 2576
 
5.2%
3520 2569
 
5.2%
3020 2346
 
4.8%
Other values (14) 19161
38.9%
ValueCountFrequency (%)
1010 2187
4.4%
1510 2847
5.8%
2010 4390
8.9%
2020 910
 
1.8%
2030 1506
 
3.1%
2510 542
 
1.1%
2520 1261
 
2.6%
2530 1597
 
3.2%
2550 2049
4.2%
3010 589
 
1.2%
ValueCountFrequency (%)
6010 2628
5.3%
5510 3395
6.9%
5020 1611
3.3%
5010 429
 
0.9%
4530 1854
3.8%
4520 1652
3.4%
4510 3070
6.2%
4030 2576
5.2%
4020 3005
6.1%
4010 2336
4.7%

SIC
Real number (ℝ)

Distinct 178
Distinct (%) 0.4%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Mean 4822.9585
Minimum 100
Maximum 9997
Zeros 0
Zeros (%) 0.0%
Negative 0
Negative (%) 0.0%
Memory size 384.9 KiB

Quantile statistics

Minimum 100
5-th percentile 2030
Q1 3572
median 4911
Q3 6282
95-th percentile 7373
Maximum 9997
Range 9897
Interquartile range (IQR) 2710

Descriptive statistics

Standard deviation 1839.929
Coefficient of variation (CV) 0.38149384
Kurtosis -0.67119573
Mean 4822.9585
Median Absolute Deviation (MAD) 1351
Skewness 0.10163571
Sum 2.3751624 × 10^8
Variance 3385338.6
Monotonicity Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
6020 2562
 
5.2%
6798 2422
 
4.9%
7370 1741
 
3.5%
4931 1606
 
3.3%
3674 1472
 
3.0%
4911 1376
 
2.8%
6331 1218
 
2.5%
2834 1162
 
2.4%
7372 1053
 
2.1%
1311 922
 
1.9%
Other values (168) 33713
68.5%
ValueCountFrequency (%)
100 11
 
< 0.1%
1000 104
 
0.2%
1040 111
 
0.2%
1311 922
1.9%
1389 254
 
0.5%
1400 204
 
0.4%
1531 360
 
0.7%
1731 94
 
0.2%
2000 111
 
0.2%
2011 217
 
0.4%
ValueCountFrequency (%)
9997 386
0.8%
8742 114
 
0.2%
8731 152
 
0.3%
8721 87
 
0.2%
8700 89
 
0.2%
8090 94
 
0.2%
8071 188
0.4%
8062 158
0.3%
8000 109
 
0.2%
7990 317
0.6%

Interactions

[Interaction plots; images not preserved]
2023-05-08T00:15:49.684143image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:50.964965image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:52.512502image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:53.849711image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:55.235340image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:37.283700image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:38.605515image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:40.075867image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:41.395076image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:42.882276image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:44.201663image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:45.540820image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:47.097708image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:48.431628image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:49.772664image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:51.055773image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:52.602385image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:53.939108image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:55.329812image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:37.384250image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:38.702142image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:40.172629image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:41.498017image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:42.994367image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:44.298666image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:45.641139image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:47.208919image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:48.533203image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:49.864124image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:51.152660image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:52.697748image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:54.036285image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:55.426381image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:37.481315image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:38.799004image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:40.262777image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:41.602969image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:43.084777image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:44.396200image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:45.739062image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:47.298904image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:48.632621image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:49.956676image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:51.247643image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:52.792572image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:54.125175image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:55.514144image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:37.575510image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:38.888999image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:40.350167image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:41.698291image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:43.170829image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:44.485614image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:45.828973image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:47.383696image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:48.726014image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:50.042328image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:51.336442image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:52.880314image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:54.212855image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:55.615515image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:37.679164image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:38.999451image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:40.448452image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:41.804907image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:43.269623image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:44.590793image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:45.934096image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:47.481555image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:48.832565image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:50.139513image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:51.437786image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:52.982522image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:54.310459image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:55.704704image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:37.769038image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:39.097854image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:40.535695image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:41.899441image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:43.356793image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:44.683514image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:46.031356image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:47.572978image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:48.928716image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:50.227665image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:51.524692image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:53.075071image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:54.397993image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:55.792750image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:37.863405image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:39.189734image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:40.627576image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:41.997184image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:43.451932image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:44.777611image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:46.129530image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:47.666671image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:49.027437image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:50.314966image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:51.622018image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:53.164918image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:54.491704image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:55.881510image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:37.954160image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:39.280033image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:40.715453image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:42.096064image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:43.545349image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:44.873601image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:46.224670image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:47.756548image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:49.123362image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:50.404276image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:51.716499image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:53.259454image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:54.583291image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:55.975144image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:38.046493image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:39.503533image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:40.806575image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:42.191218image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:43.633709image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:44.965723image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:46.316310image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:47.846641image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:49.218884image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:50.492722image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:51.809467image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:53.348292image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2023-05-08T00:15:54.679579image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/

Correlations

                GVKEY  DIRNBR  CASH_FEES  STOCK_AWARDS  OPTION_AWARDS  NONEQ_INCENT  PENSION_CHG  OTHCOMP  TOTAL_SEC    YEAR  SUB_TELE   NAICS  SPINDEX     SIC  EXCHANGE   STATE  SPCODE
GVKEY           1.000  -0.089     -0.148         0.005          0.060        -0.057       -0.099   -0.203     -0.021   0.027    -0.020   0.274    0.127   0.231     0.174   0.241   0.060
DIRNBR         -0.089   1.000      0.049         0.015         -0.069        -0.004        0.017    0.060      0.011   0.028    -0.022  -0.020   -0.008  -0.012     0.067   0.061   0.000
CASH_FEES      -0.148   0.049      1.000         0.190         -0.112        -0.027        0.061    0.197      0.523   0.171    -0.004  -0.089   -0.083  -0.104     0.007   0.019   0.000
STOCK_AWARDS    0.005   0.015      0.190         1.000         -0.302        -0.006       -0.016    0.056      0.687   0.286     0.015  -0.008    0.027  -0.002     0.003   0.000   0.000
OPTION_AWARDS   0.060  -0.069     -0.112        -0.302          1.000         0.001       -0.011   -0.107      0.066  -0.143     0.022   0.019   -0.003  -0.011     0.024   0.000   0.000
NONEQ_INCENT   -0.057  -0.004     -0.027        -0.006          0.001         1.000        0.067    0.042      0.008  -0.017    -0.045  -0.035   -0.064  -0.055     0.004   0.000   0.000
PENSION_CHG    -0.099   0.017      0.061        -0.016         -0.011         0.067        1.000    0.076      0.052  -0.045    -0.003  -0.083    0.031  -0.032     0.020   0.036   0.000
OTHCOMP        -0.203   0.060      0.197         0.056         -0.107         0.042        0.076    1.000      0.227  -0.020     0.031  -0.113   -0.147  -0.137     0.000   0.000   0.000
TOTAL_SEC      -0.021   0.011      0.523         0.687          0.066         0.008        0.052    0.227      1.000   0.280     0.040  -0.031   -0.017  -0.039     0.003   0.000   0.000
YEAR            0.027   0.028      0.171         0.286         -0.143        -0.017       -0.045   -0.020      0.280   1.000     0.002   0.004    0.001   0.005     0.000   0.000   0.000
SUB_TELE       -0.020  -0.022     -0.004         0.015          0.022        -0.045       -0.003    0.031      0.040   0.002     1.000  -0.091   -0.001  -0.004     0.152   0.542   0.108
NAICS           0.274  -0.020     -0.089        -0.008          0.019        -0.035       -0.083   -0.113     -0.031   0.004    -0.091   1.000    0.267   0.775     0.143   0.382   0.082
SPINDEX         0.127  -0.008     -0.083         0.027         -0.003        -0.064        0.031   -0.147     -0.017   0.001    -0.001   0.267    1.000   0.473     0.267   0.356   0.181
SIC             0.231  -0.012     -0.104        -0.002         -0.011        -0.055       -0.032   -0.137     -0.039   0.005    -0.004   0.775    0.473   1.000     0.189   0.339   0.165
EXCHANGE        0.174   0.067      0.007         0.003          0.024         0.004        0.020    0.000      0.003   0.000     0.152   0.143    0.267   0.189     1.000   0.317   0.096
STATE           0.241   0.061      0.019         0.000          0.000         0.000        0.036    0.000      0.000   0.000     0.542   0.382    0.356   0.339     0.317   1.000   0.109
SPCODE          0.060   0.000      0.000         0.000          0.000         0.000        0.000    0.000      0.000   0.000     0.108   0.082    0.181   0.165     0.096   0.109   1.000
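
The "High correlation" alerts in the overview can be cross-checked against this matrix. The sketch below is a minimal illustration, not the report generator's own code: it copies a subset of the coefficients shown above into a plain list of lists and lists every variable pair whose absolute coefficient reaches a cutoff. The function name `high_pairs` and the cutoff of 0.5 are assumptions chosen for illustration; the report does not state the threshold it used.

```python
from itertools import combinations

# Subset of the variables and coefficients reported in the matrix above.
names = ["CASH_FEES", "STOCK_AWARDS", "TOTAL_SEC", "SUB_TELE", "NAICS", "SIC", "STATE"]
matrix = [
    [1.000, 0.190, 0.523, -0.004, -0.089, -0.104, 0.019],
    [0.190, 1.000, 0.687, 0.015, -0.008, -0.002, 0.000],
    [0.523, 0.687, 1.000, 0.040, -0.031, -0.039, 0.000],
    [-0.004, 0.015, 0.040, 1.000, -0.091, -0.004, 0.542],
    [-0.089, -0.008, -0.031, -0.091, 1.000, 0.775, 0.382],
    [-0.104, -0.002, -0.039, -0.004, 0.775, 1.000, 0.339],
    [0.019, 0.000, 0.000, 0.542, 0.382, 0.339, 1.000],
]


def high_pairs(names, matrix, threshold=0.5):
    """Return (var_a, var_b, coefficient) for each pair with |coefficient| >= threshold.

    The threshold is an assumed value, not one taken from the report.
    """
    pairs = []
    for i, j in combinations(range(len(names)), 2):
        if abs(matrix[i][j]) >= threshold:
            pairs.append((names[i], names[j], matrix[i][j]))
    # Strongest relationships first.
    return sorted(pairs, key=lambda p: -abs(p[2]))


for a, b, r in high_pairs(names, matrix):
    print(f"{a} ~ {b}: {r:.3f}")
```

With this assumed cutoff, the pairs returned are NAICS/SIC, STOCK_AWARDS/TOTAL_SEC, SUB_TELE/STATE and CASH_FEES/TOTAL_SEC, which are the same pairs named in the alerts.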

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion by eye.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
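
To make the idea of nullity correlation concrete, here is a minimal sketch under stated assumptions. Each column is turned into a 0/1 "is missing" indicator, and the Pearson correlation of two indicators is computed. The rows are toy data invented for illustration (not taken from this dataset), and `pearson` and `nullity_correlation` are hypothetical helper names; the report's heatmap may be computed differently.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def nullity_correlation(rows, col_a, col_b):
    """Correlate the missingness (None) indicators of two columns."""
    missing_a = [1 if row[col_a] is None else 0 for row in rows]
    missing_b = [1 if row[col_b] is None else 0 for row in rows]
    return pearson(missing_a, missing_b)


# Toy rows, invented for illustration only.
toy_rows = [
    {"STATE": "AA", "ZIP": "00001"},
    {"STATE": None, "ZIP": None},
    {"STATE": None, "ZIP": "00002"},
    {"STATE": "BB", "ZIP": "00003"},
]

r = nullity_correlation(toy_rows, "STATE", "ZIP")
print(round(r, 3))
```

A value near 1 means the two columns tend to be missing in the same rows, a value near -1 means one tends to be present when the other is missing, and a value near 0 means the missingness patterns are unrelated. A column with no missing values (or only missing values) has zero variance, so it has to be excluded before calling this helper.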

Sample

First 10 rows

| GVKEY | DIRNBR | DIRNAME | CASH_FEES | STOCK_AWARDS | OPTION_AWARDS | NONEQ_INCENT | PENSION_CHG | OTHCOMP | TOTAL_SEC | YEAR | CONAME | CUSIP | EXCHANGE | ADDRESS | CITY | STATE | ZIP | SICDESC | NAICSDESC | INDDESC | SPCODE | TICKER | SUB_TELE | NAICS | SPINDEX | SIC
0 | 1045 | 1 | Roger T. Staubach | 37.0 | 24.103 | 0.0 | 0.0 | 0.000 | 8.349 | 69.452 | 2010 | AMERICAN AIRLINES GROUP INC | 02376R10 | NAS | 1 Skyview Drive | Fort Worth | TX | 76155 | AIR TRANSPORTATION, SCHEDULED | Scheduled Passenger Air Transportation | Airlines | SP | AAL | 682.0 | 481111 | 2030 | 4512
1 | 1045 | 2 | Ann McLaughlin Korologos | 39.0 | 18.949 | 0.0 | 0.0 | 14.485 | 8.521 | 80.955 | 2010 | AMERICAN AIRLINES GROUP INC | 02376R10 | NAS | 1 Skyview Drive | Fort Worth | TX | 76155 | AIR TRANSPORTATION, SCHEDULED | Scheduled Passenger Air Transportation | Airlines | SP | AAL | 682.0 | 481111 | 2030 | 4512
2 | 1045 | 3 | Judith Rodin, Ph.D. | 37.0 | 24.103 | 0.0 | 0.0 | 0.000 | 15.078 | 76.181 | 2010 | AMERICAN AIRLINES GROUP INC | 02376R10 | NAS | 1 Skyview Drive | Fort Worth | TX | 76155 | AIR TRANSPORTATION, SCHEDULED | Scheduled Passenger Air Transportation | Airlines | SP | AAL | 682.0 | 481111 | 2030 | 4512
3 | 1045 | 4 | David L. Boren | 39.0 | 18.949 | 0.0 | 0.0 | 16.300 | 3.078 | 77.327 | 2010 | AMERICAN AIRLINES GROUP INC | 02376R10 | NAS | 1 Skyview Drive | Fort Worth | TX | 76155 | AIR TRANSPORTATION, SCHEDULED | Scheduled Passenger Air Transportation | Airlines | SP | AAL | 682.0 | 481111 | 2030 | 4512
4 | 1045 | 5 | Ray M. Robinson, Jr. | 39.0 | 24.103 | 0.0 | 0.0 | 0.000 | 9.630 | 72.733 | 2010 | AMERICAN AIRLINES GROUP INC | 02376R10 | NAS | 1 Skyview Drive | Fort Worth | TX | 76155 | AIR TRANSPORTATION, SCHEDULED | Scheduled Passenger Air Transportation | Airlines | SP | AAL | 682.0 | 481111 | 2030 | 4512
5 | 1045 | 6 | Armando M. Codina | 37.0 | 18.949 | 0.0 | 0.0 | 13.338 | 5.433 | 74.720 | 2010 | AMERICAN AIRLINES GROUP INC | 02376R10 | NAS | 1 Skyview Drive | Fort Worth | TX | 76155 | AIR TRANSPORTATION, SCHEDULED | Scheduled Passenger Air Transportation | Airlines | SP | AAL | 682.0 | 481111 | 2030 | 4512
6 | 1045 | 7 | Michael A. Miles | 39.0 | 24.103 | 0.0 | 0.0 | 0.000 | 5.223 | 68.326 | 2010 | AMERICAN AIRLINES GROUP INC | 02376R10 | NAS | 1 Skyview Drive | Fort Worth | TX | 76155 | AIR TRANSPORTATION, SCHEDULED | Scheduled Passenger Air Transportation | Airlines | SP | AAL | 682.0 | 481111 | 2030 | 4512
7 | 1045 | 8 | John W. Bachmann | 39.0 | 24.103 | 0.0 | 0.0 | 0.000 | 17.557 | 80.660 | 2010 | AMERICAN AIRLINES GROUP INC | 02376R10 | NAS | 1 Skyview Drive | Fort Worth | TX | 76155 | AIR TRANSPORTATION, SCHEDULED | Scheduled Passenger Air Transportation | Airlines | SP | AAL | 682.0 | 481111 | 2030 | 4512
8 | 1045 | 9 | Rajat Kumar Gupta | 38.0 | 24.103 | 0.0 | 0.0 | 0.000 | 8.827 | 70.930 | 2010 | AMERICAN AIRLINES GROUP INC | 02376R10 | NAS | 1 Skyview Drive | Fort Worth | TX | 76155 | AIR TRANSPORTATION, SCHEDULED | Scheduled Passenger Air Transportation | Airlines | SP | AAL | 682.0 | 481111 | 2030 | 4512
9 | 1045 | 10 | Philip J. Purcell, III | 39.0 | 24.103 | 0.0 | 0.0 | 0.000 | 5.272 | 68.375 | 2010 | AMERICAN AIRLINES GROUP INC | 02376R10 | NAS | 1 Skyview Drive | Fort Worth | TX | 76155 | AIR TRANSPORTATION, SCHEDULED | Scheduled Passenger Air Transportation | Airlines | SP | AAL | 682.0 | 481111 | 2030 | 4512

Last 10 rows

| GVKEY | DIRNBR | DIRNAME | CASH_FEES | STOCK_AWARDS | OPTION_AWARDS | NONEQ_INCENT | PENSION_CHG | OTHCOMP | TOTAL_SEC | YEAR | CONAME | CUSIP | EXCHANGE | ADDRESS | CITY | STATE | ZIP | SICDESC | NAICSDESC | INDDESC | SPCODE | TICKER | SUB_TELE | NAICS | SPINDEX | SIC
49237 | 316056 | 4 | Carla Cico | 140.000 | 100.033 | 0.0 | 0.0 | 0.0 | 0.0 | 240.033 | 2018 | ALLEGION PLC | G0176J10 | NYS | Iveagh Court, Block D, Harcourt Road | Dublin | NaN | D02 V | CUTLERY, HANDTOOLS, AND GENERAL HARDWARE | Hardware Manufacturing | Building Products | SP | ALLE | 353.0 | 332510 | 2010 | 3420
49238 | 316056 | 5 | Charles L. Szews | 103.846 | 100.033 | 0.0 | 0.0 | 0.0 | 0.0 | 203.879 | 2018 | ALLEGION PLC | G0176J10 | NYS | Iveagh Court, Block D, Harcourt Road | Dublin | NaN | D02 V | CUTLERY, HANDTOOLS, AND GENERAL HARDWARE | Hardware Manufacturing | Building Products | SP | ALLE | 353.0 | 332510 | 2010 | 3420
49239 | 316056 | 6 | Dean I. Schaffer | 152.000 | 100.033 | 0.0 | 0.0 | 0.0 | 0.0 | 252.033 | 2018 | ALLEGION PLC | G0176J10 | NYS | Iveagh Court, Block D, Harcourt Road | Dublin | NaN | D02 V | CUTLERY, HANDTOOLS, AND GENERAL HARDWARE | Hardware Manufacturing | Building Products | SP | ALLE | 353.0 | 332510 | 2010 | 3420
49240 | 316056 | 7 | Michael J. Chesser | 15.167 | 0.000 | 0.0 | 0.0 | 0.0 | 0.0 | 15.167 | 2018 | ALLEGION PLC | G0176J10 | NYS | Iveagh Court, Block D, Harcourt Road | Dublin | NaN | D02 V | CUTLERY, HANDTOOLS, AND GENERAL HARDWARE | Hardware Manufacturing | Building Products | SP | ALLE | 353.0 | 332510 | 2010 | 3420
49241 | 316056 | 1 | Nicole Parent Haughey | 140.000 | 100.067 | 0.0 | 0.0 | 0.0 | 0.0 | 240.067 | 2019 | ALLEGION PLC | G0176J10 | NYS | Iveagh Court, Block D, Harcourt Road | Dublin | NaN | D02 V | CUTLERY, HANDTOOLS, AND GENERAL HARDWARE | Hardware Manufacturing | Building Products | SP | ALLE | 353.0 | 332510 | 2010 | 3420
49242 | 316056 | 2 | Kirk S. Hachigian | 165.000 | 100.067 | 0.0 | 0.0 | 0.0 | 0.0 | 265.067 | 2019 | ALLEGION PLC | G0176J10 | NYS | Iveagh Court, Block D, Harcourt Road | Dublin | NaN | D02 V | CUTLERY, HANDTOOLS, AND GENERAL HARDWARE | Hardware Manufacturing | Building Products | SP | ALLE | 353.0 | 332510 | 2010 | 3420
49243 | 316056 | 3 | Martin E. Welch, III | 155.000 | 100.067 | 0.0 | 0.0 | 0.0 | 0.0 | 255.067 | 2019 | ALLEGION PLC | G0176J10 | NYS | Iveagh Court, Block D, Harcourt Road | Dublin | NaN | D02 V | CUTLERY, HANDTOOLS, AND GENERAL HARDWARE | Hardware Manufacturing | Building Products | SP | ALLE | 353.0 | 332510 | 2010 | 3420
49244 | 316056 | 4 | Carla Cico | 140.000 | 100.067 | 0.0 | 0.0 | 0.0 | 0.0 | 240.067 | 2019 | ALLEGION PLC | G0176J10 | NYS | Iveagh Court, Block D, Harcourt Road | Dublin | NaN | D02 V | CUTLERY, HANDTOOLS, AND GENERAL HARDWARE | Hardware Manufacturing | Building Products | SP | ALLE | 353.0 | 332510 | 2010 | 3420
49245 | 316056 | 5 | Charles L. Szews | 140.000 | 100.067 | 0.0 | 0.0 | 0.0 | 0.0 | 240.067 | 2019 | ALLEGION PLC | G0176J10 | NYS | Iveagh Court, Block D, Harcourt Road | Dublin | NaN | D02 V | CUTLERY, HANDTOOLS, AND GENERAL HARDWARE | Hardware Manufacturing | Building Products | SP | ALLE | 353.0 | 332510 | 2010 | 3420
49246 | 316056 | 6 | Dean I. Schaffer | 152.000 | 100.067 | 0.0 | 0.0 | 0.0 | 0.0 | 252.067 | 2019 | ALLEGION PLC | G0176J10 | NYS | Iveagh Court, Block D, Harcourt Road | Dublin | NaN | D02 V | CUTLERY, HANDTOOLS, AND GENERAL HARDWARE | Hardware Manufacturing | Building Products | SP | ALLE | 353.0 | 332510 | 2010 | 3420
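
In the sample rows shown here, TOTAL_SEC appears to equal the sum of the six compensation components (CASH_FEES, STOCK_AWARDS, OPTION_AWARDS, NONEQ_INCENT, PENSION_CHG, OTHCOMP), which would also explain the high correlation alerts involving TOTAL_SEC. The sketch below checks that observation on four of the displayed rows only. It is an illustrative consistency check, not a statement about the full dataset; the helper name `mismatched_rows` and the tolerance are assumptions.

```python
# (director, [CASH_FEES, STOCK_AWARDS, OPTION_AWARDS, NONEQ_INCENT,
#             PENSION_CHG, OTHCOMP], TOTAL_SEC), copied from the sample rows.
sample = [
    ("Roger T. Staubach", [37.0, 24.103, 0.0, 0.0, 0.000, 8.349], 69.452),
    ("Ann McLaughlin Korologos", [39.0, 18.949, 0.0, 0.0, 14.485, 8.521], 80.955),
    ("Carla Cico", [140.000, 100.033, 0.0, 0.0, 0.0, 0.0], 240.033),
    ("Michael J. Chesser", [15.167, 0.000, 0.0, 0.0, 0.0, 0.0], 15.167),
]


def mismatched_rows(rows, tol=0.005):
    """Return names whose components do not sum to TOTAL_SEC within tol.

    tol is an assumed allowance for values displayed to three decimals.
    """
    return [
        name
        for name, components, total in rows
        if abs(sum(components) - total) > tol
    ]


print(mismatched_rows(sample))  # an empty list means every checked row is consistent
```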